My Garlic Garden

Last year (2023), my coworker gave me some of his famous home-grown, hard-neck variety garlic. I planted nine of the cloves and harvest nine heads with 6 cloves each. This year (fall 2024), I’m planting 36 cloves, about 2/3rds of what I harvested this summer.

I’m tracking the weight of each clove at planting time to see if I can tell how clove weight at planting impacts the weight at harvest time.

This repo will help me track and analyze the data, while also serving as a tutorial in basic R programming skills for any interested viewers.

Weights at time of planting

grams_24 <- c(8, 17, 14, 20, 13, 13, 11, 15, 11,
              10, 19, 19, 19, 13, 17, 13, 15, 11,
              12, 19, 14, 13, 21, 8, 5, 13, 12, 13,
              17, 9, 8, 17, 9, 20, 7, 13)

Summaray of weights of planted cloves

summary(grams_24)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    5.00   11.00   13.00   13.56   17.00   21.00

Comparison to store-bought cloves

A couple cloves were less than 1 gram. Even the 2-gram cloves were so small that they might not be worth the effort when cooking for a family.

Since the cloves are replanted the same year the bulbs are harvested, I can use “year” to keep track of each generation. The 2023 cloves produced the 2024 bulbs which I “cracked” to get the 2024 cloves. Those will produce the 2025 bulbs, which…

store_grams_24 <- c(5, 2, 4, 3, 3, 8, 4, 3, 9, 4, 3, 2,
                    4, 4, 2, 2, 2, 3, 2, 2, 1, 1, 1)
comparison <- data.frame(sow_weight = c(grams_24, store_grams_24),
                     year = rep(2024, times = length(c(grams_24,
                                                             store_grams_24))),
                     source = c(rep("home-grown", times = length(grams_24)),
                                rep("store", times = length(store_grams_24))))
Long form vs wide form data

The data above is in “long form”, meaning that each row represents an individual (clove), and each column represents a property (sow_weight, source, year).

“Wide format” data is more common in spreadsheets, where this data would be shown as two columns: “home-grown weight 2024” and “store-bought weight 2024.” Each year, I’d have to add two more columns and change the code to process those new columns.

With “long form” data, we can use formulas to break down the data in one column based on values in another column. The formula sow_weight ~ source + sow_year tells the function to work with sow_weight but to group the data based on the volues in the source and sow_year columns.

Formulas in R

See how the formula automatically processes one column of sow_weights into two boxplots based on the values in the other columns.

formula <- sow_weight ~ source + year
boxplot(data = comparison, formula, ylab = "weight (grams)")

Conclusion

From the boxplots above and the summary table below, you can see why I like this garlic variety: the cloves are more than 4 times as big! A lot easier to handle when cooking for a family.

aggregate(data = comparison, formula,
          FUN = summary, digits = 2) #digits is passed to the summary function
##       source year sow_weight.Min. sow_weight.1st Qu. sow_weight.Median
## 1 home-grown 2024             5.0               11.0              13.0
## 2      store 2024             1.0                2.0               3.0
##   sow_weight.Mean sow_weight.3rd Qu. sow_weight.Max.
## 1            14.0               17.0            21.0
## 2             3.2                4.0             9.0

Planting Strategy

Garlic is generally planted in the fall and harvested in the summer. In my garden near the Denver metro, that means mid October to early to mid July.

Several factors, not just clove weight, can impact growth: uneven soil/nutrient densities, irrigation/drainage, and wind/sun.

The location that each clove was planted in can serve as a proxy for those factors. The plants on the perimeter of the raised bed will experience more drainage. Certain sides of the box may get more or less light/shading as the rest of the garden grows.

I don’t want my planting strategy to confound these factors with weight. If I plant the large cloves in one spot and the small ones in another, how will I know if the size or the location is causing the difference?

I used R to help me randomize the planting locations. Assigned each clove an ID (number 1 through 36) then had R randomize the order.

id <- 1:36
randomized_example <- sample(id)
randomized_example
##  [1] 15 18  9 16 17 21 25 13 27  7 36 33 23 26 24 11  4  8 32 31  6 29  1  3 28
## [26] 22 20 34 30 12 10  2 14  5 19 35
planting_order <- c(23, 18, 2, 36, 3, 27, 12, 11, 1,
                    35, 6, 21, 9, 24, 28, 22, 25, 31,
                    33, 8, 4, 15, 14, 26, 5, 16, 13,
                    7, 17, 10, 19, 20, 32, 29, 34, 30)

The randomized_example above is different every time you load this page. (As it should be. It’s random!)

The randomization I used is recorded as planting_order. Clove #23 is at the North East corner. Clove #30 is in the South West corner of the raised bed.

dimensions <- c(4, 9) #East to West, North to South
row_from_east <- c(rep(c(1:dimensions[1]), each = dimensions[2]))
row_from_north <- c(rep(1:dimensions[2], times = dimensions[1]))

garlic <- data.frame(id = planting_order,
                     from_east = row_from_east,
                     from_north = row_from_north,
                     year = rep(2024, times = length(id)))

I’d like to merge the position information in the garlic dataframe with the weights in the grams_24 vector.

Since garlic is randomly ordered, we need to re-order it by id to match the order of the weigths in the grams_24 vector.

head(garlic) #randomized IDs
##   id from_east from_north year
## 1 23         1          1 2024
## 2 18         1          2 2024
## 3  2         1          3 2024
## 4 36         1          4 2024
## 5  3         1          5 2024
## 6 27         1          6 2024
garlic <- garlic[order(garlic$id), ]
head(garlic) #sorted by IDs
##    id from_east from_north year
## 9   1         1          9 2024
## 3   2         1          3 2024
## 5   3         1          5 2024
## 21  4         3          3 2024
## 25  5         3          7 2024
## 11  6         2          2 2024
#create new column using data from vector
garlic$weight <- grams_24

#renumber the rowname numbers
rownames(garlic) <- 1:nrow(garlic)
head(garlic)
##   id from_east from_north year weight
## 1  1         1          9 2024      8
## 2  2         1          3 2024     17
## 3  3         1          5 2024     14
## 4  4         3          3 2024     20
## 5  5         3          7 2024     13
## 6  6         2          2 2024     13

Here’s a diagram of where each clove was planted.